AITopics | high-probability complexity

Collaborating Authors

high-probability complexity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

StochasticOptimizationwith Heavy-Tailed Noisevia AcceleratedGradientClipping

Neural Information Processing SystemsFeb-9-2026, 19:01:14 GMT

To illustrate this phenomenon we consider a simple example of stochastic optimization problem andapplySGDwithconstant stepsizetosolveit.

artificial intelligence, arxivpreprintarxiv, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > Russia (0.05)
Asia > Russia (0.05)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping

Neural Information Processing SystemsDec-24-2025, 10:37:25 GMT

In this paper, we propose a new accelerated stochastic first-order method called clipped-SSTM for smooth convex stochastic optimization with heavy-tailed distributed noise in stochastic gradients and derive the first high-probability complexity bounds for this method closing the gap in the theory of stochastic optimization with heavy-tailed noise. Our method is based on a special variant of accelerated Stochastic Gradient Descent (SGD) and clipping of stochastic gradients. We extend our method to the strongly convex case and prove new complexity bounds that outperform state-of-the-art results in this case. Finally, we extend our proof technique and derive the first non-trivial high-probability complexity bounds for SGD with clipping without light-tails assumption on the noise.

accelerated gradient clipping, heavy-tailed noise, stochastic optimization, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

abd1c782880cc59759f4112fda0b8f98-Paper.pdf

Neural Information Processing SystemsAug-15-2025, 17:54:55 GMT

complexity, optimization, stochastic gradient, (12 more...)

Neural Information Processing Systems

Country:

Europe > Russia (0.14)
Asia > Russia (0.14)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

High-probability complexity bounds for stochastic non-convex minimax optimization

Neural Information Processing SystemsMay-27-2025, 22:10:38 GMT

Stochastic smooth nonconvex minimax problems are prevalent in machine learning, e.g., GAN training, fair classification, and distributionally robust learning. Stochastic gradient descent ascent (GDA)-type methods are popular in practice due to their simplicity and single-loop nature. However, there is a significant gap between the theory and practice regarding high-probability complexity guarantees for these methods on stochastic nonconvex minimax problems. Existing high-probability bounds for GDA-type single-loop methods only apply to convex/concave minimax problems and to particular non-monotone variational inequality problems under some restrictive assumptions. In this work, we address this gap by providing the first high-probability complexity guarantees for nonconvex/PL minimax problems corresponding to a smooth function that satisfies the PL-condition in the dual variable.

high-probability complexity, minimax problem, stochastic non-convex minimax optimization, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)

Add feedback

Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping

Neural Information Processing SystemsOct-11-2024, 02:10:48 GMT

accelerated gradient clipping, heavy-tailed noise, stochastic optimization, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise

Gorbunov, Eduard, Danilova, Marina, Shibaev, Innokentiy, Dvurechensky, Pavel, Gasnikov, Alexander

arXiv.org Artificial IntelligenceAug-30-2024

Stochastic first-order methods are standard for training large-scale machine learning models. Random behavior may cause a particular run of an algorithm to result in a highly suboptimal objective value, whereas theoretical guarantees are usually proved for the expectation of the objective value. Thus, it is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth stochastic convex optimization have complexity bounds with the dependence on the confidence level that is either negative-power or logarithmic but under an additional assumption of sub-Gaussian (light-tailed) noise distribution that may not hold in practice. In our paper, we resolve this issue and derive the first high-probability convergence results with logarithmic dependence on the confidence level for non-smooth convex stochastic optimization problems with non-sub-Gaussian (heavy-tailed) noise. To derive our results, we propose novel stepsize rules for two stochastic methods with gradient clipping. Moreover, our analysis works for generalized smooth objectives with H\"older-continuous gradients, and for both methods, we provide an extension for strongly convex problems. Finally, our results imply that the first (accelerated) method we consider also has optimal iteration and oracle complexity in all the regimes, and the second one is optimal in the non-smooth setting.

batch size, gradient, probability, (16 more...)

arXiv.org Artificial Intelligence

2106.05958

Country:

Asia > Russia (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback